39 research outputs found

    Improving searchability of automatically transcribed lectures through dynamic language modelling

    Recording university lectures through lecture capture systems is increasingly common. However, a single continuous audio recording is often unhelpful for users, who may wish to navigate quickly to a particular part of a lecture, or locate a specific lecture within a set of recordings. A transcript of the recording can enable faster navigation and searching. Automatic speech recognition (ASR) technologies may be used to create automated transcripts, to avoid the significant time and cost involved in manual transcription. Low accuracy of ASR-generated transcripts may however limit their usefulness. In particular, ASR systems optimized for general speech recognition may not recognize the many technical or discipline-specific words occurring in university lectures. To improve the usefulness of ASR transcripts for the purposes of information retrieval (search) and navigating within recordings, the lexicon and language model used by the ASR engine may be dynamically adapted for the topic of each lecture. A prototype is presented which uses the English Wikipedia as a semantically dense, large language corpus to generate a custom lexicon and language model for each lecture from a small set of keywords. Two strategies for extracting a topic-specific subset of Wikipedia articles are investigated: a naïve crawler which follows all article links from a set of seed articles produced by a Wikipedia search from the initial keywords, and a refinement which follows only links to articles sufficiently similar to the parent article. Pair-wise article similarity is computed from a pre-computed vector space model of Wikipedia article term scores generated using latent semantic indexing. The CMU Sphinx4 ASR engine is used to generate transcripts from thirteen recorded lectures from Open Yale Courses, using the English HUB4 language model as a reference and the two topic-specific language models generated for each lecture from Wikipedia. 
Three standard metrics – Perplexity, Word Error Rate and Word Correct Rate – are used to evaluate the extent to which the adapted language models improve the searchability of the resulting transcripts, and in particular improve the recognition of specialist words. Ranked Word Correct Rate is proposed as a new metric better aligned with the goals of improving transcript searchability and specialist word recognition. Analysis of recognition performance shows that the language models derived using the similarity-based Wikipedia crawler outperform models created using the naïve crawler, and that transcripts using similarity-based language models have better perplexity and Ranked Word Correct Rate scores than those created using the HUB4 language model, but worse Word Error Rates. It is concluded that English Wikipedia may successfully be used as a language resource for unsupervised topic adaptation of language models to improve recognition performance for better searchability of lecture recording transcripts, although possibly at the expense of other attributes such as readability.
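
As a rough illustration of two of the standard metrics named above, the following minimal sketch computes Word Error Rate and Word Correct Rate from a minimum-edit word alignment. It assumes whitespace tokenisation and unit edit costs; the function names are hypothetical and this is not the evaluation code used in the work. Ranked Word Correct Rate is not sketched, since its definition is not given in the abstract.

```python
def align_counts(reference, hypothesis):
    """Return (substitutions, deletions, insertions, hits) for a
    minimum-edit word alignment. Assumes whitespace tokenisation."""
    ref, hyp = reference.split(), hypothesis.split()
    n, m = len(ref), len(hyp)
    # cost[i][j] = minimum edits needed to turn ref[:i] into hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Walk back through the table to count each edit type.
    subs = dels = ins = hits = 0
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            if ref[i - 1] == hyp[j - 1]:
                hits += 1
            else:
                subs += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return subs, dels, ins, hits


def word_error_rate(reference, hypothesis):
    """WER = (S + D + I) / N, where N is the number of reference words."""
    subs, dels, ins, _ = align_counts(reference, hypothesis)
    return (subs + dels + ins) / len(reference.split())


def word_correct_rate(reference, hypothesis):
    """WCR = H / N; insertions are not penalised."""
    _, _, _, hits = align_counts(reference, hypothesis)
    return hits / len(reference.split())
```

Because Word Correct Rate ignores insertions, a transcript can score well on it while scoring poorly on Word Error Rate, which is consistent with the trade-off the abstract reports.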

    Low Resource, Post-processed Lecture Recording from 4K Video Streams

    Many universities are using lecture recording technology to expand the reach of their teaching programs, and to continue instruction when face-to-face lectures are not possible. Increasingly, high-resolution 4K cameras are used, since they allow for easy reading of board/screen context. Unfortunately, while 4K cameras are now quite affordable, the back-end computing infrastructure to process and distribute a multitude of recorded 4K streams can be costly. Furthermore, the bandwidth requirements for a 4K stream are exorbitant, running to over 2 GB for a 45-60 minute lecture. These factors militate against the use of such technology in a low-resource environment, and motivated our investigation into methods to reduce resource requirements for both the institution and students. We describe the design and implementation of a low-resource 4K lecture recording solution, which addresses these problems through a computationally efficient video processing pipeline. The pipeline consists of a front-end, which segments presenter motion and writing/board surfaces from the stream, and a back-end, which serves as a virtual cinematographer (VC), combining this contextual information to draw attention to the lecturer and relevant content. The bandwidth saving is realized by defining a smaller fixed-size, context-sensitive ‘cropping window’ and generating a new video from the crop regions. The front-end utilises computationally cheap temporal frame differencing at its core: this does not require expensive GPU hardware and also limits the memory required for processing. The VC receives a small set of motion/content bounding boxes and applies established framing heuristics to determine which region to extract from the full 4K frame. 
Performance results coupled to a user survey show that the system is fit for purpose: it is able to produce good presenter framing/context over a range of challenging lecture venue layouts and lighting conditions, within a time that is acceptable for lecture video processing.
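
To make the idea of temporal frame differencing concrete, here is a minimal sketch that compares two consecutive grayscale frames and returns the bounding box of the pixels that changed. It assumes frames are plain lists of rows of integer intensities; the function name and the threshold value are hypothetical, and this is not the pipeline described in the abstract.

```python
def motion_bounding_box(prev_frame, curr_frame, threshold=25):
    """Return (top, left, bottom, right) enclosing every pixel whose
    intensity changed by more than `threshold`, or None if nothing moved.
    Frames are assumed to be equally sized lists of rows of integers."""
    rows, cols = [], []
    for y, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            if abs(c - p) > threshold:
                rows.append(y)
                cols.append(x)
    if not rows:
        return None
    return min(rows), min(cols), max(rows), max(cols)


# Toy 4x5 frames: two pixels brighten between the frames.
previous = [[0] * 5 for _ in range(4)]
current = [row[:] for row in previous]
current[1][2] = 200
current[2][3] = 200
box = motion_bounding_box(previous, current)
```

Only two frames need to be held in memory at once and the work is a single pass of subtractions, which is in keeping with the abstract's point that the approach does not need GPU hardware.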

    A Virtual Cinematographer for Presenter Tracking in 4K Lecture Videos

    Lecture recording has become an important part of the provision of accessible tertiary education, and good autonomous recording and processing systems are necessary to make it feasible. In this work, we develop and evaluate a video processing framework that uses 4K video to track the lecturer and frame him/her in a way that simulates a human camera operator. We also investigate general issues pertaining to blackboard usage and its influence on cinematography decisions. We found that post-processing produced better tracking and framing results when compared to some real-time approaches. Furthermore, the entire pipeline can run on a commodity PC and will complete within the suggested time of 300% of the input video length. In fact, our testing showed that 60% of the total processing time can be ascribed to I/O operations. With the removal of redundant reads and writes, this proportion can be reduced. Finally, some algorithms can be remapped to parallel versions which will exploit multicore CPUs or GPUs if these are available.

    From Gatekeepers to Gateways: Courses Impeding Graduation Annual Report 2019

    The Courses Impeding Graduation (CIG) Project is a research and development initiative of the Centre for Higher Education Development (CHED) addressing the problem of high failure rates in courses that are obstacles to student retention and progression. This report lays out the background, aims, objectives, and outcomes of the project in 2019, with a particular focus on first-year Mathematics courses in the Faculty of Science, examining which students are at higher risk of failing these courses. The report includes student perspectives gathered through focus groups.

    Identification of genetic variants associated with Huntington's disease progression: a genome-wide association study

    Background: Huntington's disease is caused by a CAG repeat expansion in the huntingtin gene, HTT. Age at onset has been used as a quantitative phenotype in genetic analysis looking for Huntington's disease modifiers, but is hard to define and not always available. Therefore, we aimed to generate a novel measure of disease progression and to identify genetic markers associated with this progression measure. Methods: We generated a progression score on the basis of principal component analysis of prospectively acquired longitudinal changes in motor, cognitive, and imaging measures in the 218 individuals in the TRACK-HD cohort of Huntington's disease gene mutation carriers (data collected 2008–11). We generated a parallel progression score using data from 1773 previously genotyped participants from the European Huntington's Disease Network REGISTRY study of Huntington's disease mutation carriers (data collected 2003–13). We did genome-wide association analyses in terms of progression for 216 TRACK-HD participants and 1773 REGISTRY participants, and then undertook a meta-analysis of these results. Findings: Longitudinal motor, cognitive, and imaging scores were correlated with each other in TRACK-HD participants, justifying use of a single, cross-domain measure of disease progression in both studies. The TRACK-HD and REGISTRY progression measures were correlated with each other (r=0·674), and with age at onset (TRACK-HD, r=0·315; REGISTRY, r=0·234). The meta-analysis of progression in TRACK-HD and REGISTRY gave a genome-wide significant signal (p=1·12 × 10⁻¹⁰) on chromosome 5 spanning three genes: MSH3, DHFR, and MTRNR2L2. The genes in this locus were associated with progression in TRACK-HD (MSH3 p=2·94 × 10⁻⁸; DHFR p=8·37 × 10⁻⁷; MTRNR2L2 p=2·15 × 10⁻⁹) and to a lesser extent in REGISTRY (MSH3 p=9·36 × 10⁻⁴; DHFR p=8·45 × 10⁻⁴; MTRNR2L2 p=1·20 × 10⁻³). 
The lead single nucleotide polymorphism (SNP) in TRACK-HD (rs557874766) was genome-wide significant in the meta-analysis (p=1·58 × 10⁻⁸), and encodes an amino acid change (Pro67Ala) in MSH3. In TRACK-HD, each copy of the minor allele at this SNP was associated with a 0·4 units per year (95% CI 0·16–0·66) reduction in the rate of change of the Unified Huntington's Disease Rating Scale (UHDRS) Total Motor Score, and a reduction of 0·12 units per year (95% CI 0·06–0·18) in the rate of change of UHDRS Total Functional Capacity score. These associations remained significant after adjusting for age of onset. Interpretation: The multidomain progression measure in TRACK-HD was associated with a functional variant that was genome-wide significant in our meta-analysis. The association in only 216 participants implies that the progression measure is a sensitive reflection of disease burden, that the effect size at this locus is large, or both. Knockout of Msh3 reduces somatic expansion in Huntington's disease mouse models, suggesting this mechanism as an area for future therapeutic investigation.

    Review of Open Educational Resources

    This presentation provides an overview of open educational resources (OER), both in general and in South Africa.